Smart Paper Notes

Curvature-informed multi-task learning for graph networks

Alexander New , Michael J. Pekala , Nam Q. Le , Janna Domenico , Christine D. Piatko , Christopher D. Stiles

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-08-02

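Properties of interest for crystals and molecules, such as band gap, elasticity, and solubility, are generally related to each other: they are governed by the same underlying laws of physics. However, when state-of-the-art graph neural networks attempt to predict multiple properties simultaneously (the multi-task learning (MTL) setting), they frequently underperform. This suggests that graph networks may not be fully exploiting these underlying similarities. Here we investigate a potential explanation for this phenomenon: the curvature of each property's loss surface varies significantly, leading to inefficient learning. This difference in curvature can be assessed by examining spectral properties of the Hessians of each property's loss function, which is done in a matrix-free manner via randomized numerical linear algebra. We evaluate our hypothesis on two benchmark datasets (Materials Project (MP) and QM8) and consider how these findings can inform the training of novel multi-task learning models.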
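As a rough illustration of what a matrix-free look at Hessian spectra can mean, the sketch below estimates the largest Hessian eigenvalue of two toy losses using only Hessian-vector products and power iteration from a random start. It is not the authors' method: the quadratic "task" losses, the curvature values, and the helper names `make_grad`, `hvp`, and `top_eigenvalue` are all assumptions made for this example.

```python
import math
import random


def make_grad(curvatures):
    """Gradient of the toy loss 0.5 * sum(c_i * x_i**2) (hypothetical task loss)."""
    def grad(x):
        return [c * xi for c, xi in zip(curvatures, x)]
    return grad


def hvp(grad_fn, x, v, eps=1e-4):
    """Hessian-vector product from central differences of the gradient.

    The Hessian matrix itself is never formed.
    """
    g_plus = grad_fn([xi + eps * vi for xi, vi in zip(x, v)])
    g_minus = grad_fn([xi - eps * vi for xi, vi in zip(x, v)])
    return [(a - b) / (2 * eps) for a, b in zip(g_plus, g_minus)]


def top_eigenvalue(grad_fn, x, iters=100, seed=0):
    """Power iteration on Hessian-vector products, started from a random vector."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(vi * vi for vi in v))
    v = [vi / norm for vi in v]
    eig = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, x, v)
        eig = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(h * h for h in hv))
        v = [h / norm for h in hv]
    return eig


x0 = [0.3, -0.2, 0.5]  # arbitrary point in parameter space
top_a = top_eigenvalue(make_grad([1.0, 10.0, 100.0]), x0)  # sharply curved toy task
top_b = top_eigenvalue(make_grad([0.5, 1.0, 2.0]), x0)     # gently curved toy task
print(top_a, top_b)
```

In this toy setting the two tasks differ in sharpest curvature by a factor of about 50, which is the kind of mismatch the abstract suggests could make joint training inefficient.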

translated by Google Translate

Latent Discretization for Continuous-time Sequence Compression

Ricky T. Q. Chen , Matthew Le , Matthew Muckley , Maximilian Nickel , Karen Ullrich

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-28

Neural compression offers a domain-agnostic approach to creating codecs for lossy or lossless compression via deep generative models. For sequence compression, however, most deep sequence models have costs that scale with the sequence length rather than the sequence complexity. In this work, we instead treat data sequences as observations from an underlying continuous-time process and learn how to efficiently discretize while retaining information about the full sequence. As a consequence of decoupling sequential information from its temporal discretization, our approach allows for greater compression rates and smaller computational complexity. Moreover, the continuous-time approach naturally allows us to decode at different time intervals. We empirically verify our approach on multiple domains involving compression of video and motion capture sequences, showing that our approaches can automatically achieve reductions in bit rates by learning how to discretize.


Stag hunt game-based approach for cooperative UAVs

L. V. Nguyen , I. Torres Herrera , T. H. Le , M. D. Phung , R. P. Aguilera , Q. P. Ha

Category: Robotics

2022-08-29

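Unmanned aerial vehicles (UAVs) have been employed in many areas, including photography, emergency response, entertainment, defence, agriculture, forestry, mining, and construction. Over the last decade, UAV technology has found applications in numerous construction project phases, including site mapping, progress monitoring, building inspection, damage assessment, and material delivery. While the advantages of UAVs for various construction-related processes have been studied extensively, studies on UAV collaboration to improve task capacity and efficiency remain scarce. This paper proposes a new cooperative path planning algorithm for multiple UAVs based on the stag hunt game and particle swarm optimization (PSO). First, a cost function is defined for each UAV, incorporating multiple objectives and constraints. A UAV game framework is then developed to formulate multi-UAV path planning as the problem of finding a payoff-dominant equilibrium. Next, a PSO-based algorithm is proposed to obtain optimal paths for the UAVs. Simulation results for a large construction site inspected by three UAVs indicate the effectiveness of the proposed algorithm in generating feasible and efficient flight paths for the UAV formation during the inspection task.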
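To make the notion of a payoff-dominant equilibrium concrete, here is a minimal sketch for the textbook two-player stag hunt. The payoff numbers and the helper names `pure_nash_equilibria` and `payoff_dominant` are assumptions for illustration only; the paper's actual game is defined over UAV path costs, which are not reproduced here.

```python
from itertools import product

ACTIONS = ("stag", "hare")

# Hypothetical payoffs (row player, column player) for the classic stag hunt.
PAYOFFS = {
    ("stag", "stag"): (4, 4),
    ("stag", "hare"): (0, 3),
    ("hare", "stag"): (3, 0),
    ("hare", "hare"): (2, 2),
}


def pure_nash_equilibria(payoffs):
    """Return action profiles from which no player gains by deviating alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        row_ok = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in ACTIONS)
        col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


def payoff_dominant(payoffs, equilibria):
    """Return the equilibrium that is at least as good for every player as all others."""
    for e in equilibria:
        if all(
            payoffs[e][i] >= payoffs[other][i]
            for other in equilibria
            for i in range(2)
        ):
            return e
    return None


equilibria = pure_nash_equilibria(PAYOFFS)
best = payoff_dominant(PAYOFFS, equilibria)
print(equilibria, best)
```

Both (stag, stag) and (hare, hare) are equilibria in this toy game, and the cooperative one gives every player the higher payoff, which is why it is the payoff-dominant choice.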


An Accurate and Explainable Deep Learning System Improves Interobserver Agreement in the Interpretation of Chest Radiograph

Hieu H. Pham , Ha Q. Nguyen , Hieu T. Nguyen , Linh T. Le , Lam Khanh

Category: Computer Vision

2022-08-06

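Recent artificial intelligence (AI) algorithms have achieved radiologist-level performance on various medical classification tasks. However, only a few studies have addressed the localization of abnormal findings in CXR scans, which is essential for explaining image-level classification to radiologists. In this paper we introduce an explainable deep learning system called VinDr-CXR that can classify a CXR scan into multiple thoracic diseases and, at the same time, localize most types of critical findings on the image. VinDr-CXR was trained on 51,485 CXR scans with radiologist-provided bounding box annotations. It demonstrated performance comparable to that of experienced radiologists in classifying 6 common thoracic diseases on a retrospective validation set of 3,000 CXR scans, with a mean area under the receiver operating characteristic curve (AUROC) of 0.967 (95% confidence interval [CI]: 0.958-0.975). VinDr-CXR was also externally validated in independent patient cohorts and showed its robustness. For the localization task with 14 types of lesions, our free-response receiver operating characteristic (FROC) analysis showed that VinDr-CXR achieved a sensitivity of 80.2% at a rate of 1.0 false-positive lesion identified per scan. A prospective study was also conducted to measure the clinical impact of VinDr-CXR in assisting six experienced radiologists. The results indicated that the proposed system, when used as a diagnostic tool, significantly improved agreement among the radiologists themselves, with an increase of 1.5% in mean Fleiss' kappa. We also observed that, after the radiologists consulted VinDr-CXR's suggestions, the agreement between each of them and the system increased markedly, by 3.3% in mean Cohen's kappa.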
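For readers unfamiliar with the agreement statistic mentioned above, the following is a minimal sketch of Cohen's kappa for two raters, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The label vectors and the function name `cohen_kappa` are made up for illustration and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    p_e = sum((freq_a[label] / n) * (freq_b[label] / n) for label in labels)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical binary findings (1 = abnormal) from a radiologist and a system.
radiologist = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
system = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohen_kappa(radiologist, system)
print(round(kappa, 3))
```

Here the raters agree on 8 of 10 items while chance alone would give 0.5, so kappa is 0.6. Fleiss' kappa generalizes the same idea to more than two raters.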


DiffFace: Diffusion-based Face Swapping with Facial Guidance

Kihong Kim , Yunho Kim , Seokju Cho , Junyoung Seo , Jisu Nam , Kychul Lee , Seungryong Kim , KwangHee Lee

Category: Computer Vision

2022-12-27

In this paper, we propose DiffFace, the first diffusion-based face swapping framework, composed of ID-conditional DDPM training, sampling with facial guidance, and target-preserving blending. Specifically, in the training process, the ID-conditional DDPM is trained to generate face images with the desired identity. In the sampling process, we use off-the-shelf facial expert models to make the model transfer the source identity while faithfully preserving target attributes. During this process, to preserve the background of the target image and obtain the desired face swapping result, we additionally propose a target-preserving blending strategy. It helps our model protect the attributes of the target face from noise while transferring the source facial identity. In addition, without any re-training, our model can flexibly apply additional facial guidance and adaptively control the ID-attributes trade-off to achieve the desired results. To the best of our knowledge, this is the first approach that applies a diffusion model to the face swapping task. Compared with previous GAN-based approaches, DiffFace takes advantage of the diffusion model to offer benefits such as training stability, high fidelity, sample diversity, and controllability. Extensive experiments show that DiffFace is comparable or superior to state-of-the-art methods on several standard face swapping benchmarks.


Deployment of UAVs for Optimal Multihop Ad-hoc Networks Using Particle Swarm Optimization and Behavior-based Control

Ngan Duong Thi Thuy , Duy Nam Bui , Manh Duong Phung , Hung Pham Duy

Category: Robotics

2022-12-26

This study proposes an approach for establishing an optimal multihop ad-hoc network using multiple unmanned aerial vehicles (UAVs) to provide emergency communication in disaster areas. The approach has two stages: the first uses particle swarm optimization (PSO) to find optimal positions at which to deploy the UAVs, and the second uses a behavior-based controller to navigate the UAVs to their assigned positions without colliding with obstacles in an unknown environment. Several constraints related to the UAVs' sensing and communication ranges are imposed to ensure the applicability of the proposed approach in real-world scenarios. A number of simulation experiments with data loaded from real environments have been conducted. The results show that the proposed approach not only succeeds in establishing multihop ad-hoc routes but also meets the requirements for real-time deployment of UAVs.
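The first stage can be pictured with a generic PSO loop. The sketch below is a plain global-best PSO applied to a toy placement problem and is not the authors' formulation: the ground-node coordinates, the cost function `relay_cost`, and all hyperparameters are assumptions, and the sensing and communication-range constraints described in the abstract are omitted.

```python
import random


def pso(cost, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best particle swarm optimization over a box-bounded search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


# Hypothetical ground nodes that a single relay UAV should serve.
NODES = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]


def relay_cost(p):
    """Sum of squared distances from the relay position to every node."""
    return sum((p[0] - x) ** 2 + (p[1] - y) ** 2 for x, y in NODES)


best_pos, best_cost = pso(relay_cost, bounds=[(0.0, 10.0), (0.0, 10.0)])
print(best_pos, best_cost)
```

For this cost the optimum is the centroid of the nodes, (5, 3), which gives a simple check that the swarm has converged. A real deployment cost would replace `relay_cost` with terms for link quality and range limits.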


Weakly-Supervised Deep Learning Model for Prostate Cancer Diagnosis and Gleason Grading of Histopathology Images

Mohammad Mahdi Behzadi , Mohammad Madani , Hanzhang Wang , Jun Bai , Ankit Bhardwaj , Anna Tarakanova , Harold Yamase , Ga Hie Nam , Sheida Nabavi

Category: Computer Vision

2022-12-25

Prostate cancer is the most common cancer in men worldwide and the second leading cause of cancer death in the United States. One of the prognostic features in prostate cancer is the Gleason grading of histopathology images. The Gleason grade is assigned by pathologists based on tumor architecture in Hematoxylin and Eosin (H&E) stained whole slide images (WSI). This process is time-consuming and has known interobserver variability. In the past few years, deep learning algorithms have been used to analyze histopathology images, delivering promising results for grading prostate cancer. However, most of these algorithms rely on fully annotated datasets, which are expensive to generate. In this work, we propose a novel weakly-supervised algorithm to classify prostate cancer grades. The proposed algorithm consists of three steps: (1) extracting discriminative areas in a histopathology image by employing a Multiple Instance Learning (MIL) algorithm based on Transformers, (2) representing the image by constructing a graph from the discriminative patches, and (3) classifying the image into its Gleason grade by developing a Graph Convolutional Neural Network (GCN) based on a gated attention mechanism. We evaluated our algorithm using publicly available datasets, including TCGAPRAD, PANDA, and Gleason 2019 challenge datasets. We also cross-validated the algorithm on an independent dataset. Results show that the proposed model achieves state-of-the-art performance in the Gleason grading task in terms of accuracy, F1 score, and Cohen's kappa. The code is available at https://github.com/NabaviLab/Prostate-Cancer.


Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy

Christoph Bergmeir , Frits de Nijs , Abishek Sriramulu , Mahdi Abolghasemi , Richard Bean , John Betts , Quang Bui , Nam Trong Dinh , Nils Einecke , Rasul Esmaeilbeigi

Category: Artificial Intelligence

2022-12-21

Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as supply chains (inventory optimization), traffic, and the transition towards carbon-free energy generation through battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
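The idea behind sample average approximation can be shown on a much smaller problem than the competition's: pick the one decision that minimizes the cost averaged over a set of forecast scenarios. Everything in this sketch is an assumption for illustration (the demand scenarios, the two prices, and the names `scenario_cost` and `saa_best_decision`); it does not reproduce the winning method or the competition's scheduling model.

```python
# Hypothetical forecast scenarios for energy demand (arbitrary units).
SCENARIOS = [8, 10, 12, 15]

ADVANCE_PRICE = 1.0  # assumed price per unit bought ahead of time
SPOT_PRICE = 3.0     # assumed price per unit bought later to cover a shortfall


def scenario_cost(purchase, demand):
    """Cost of buying `purchase` units in advance when `demand` materializes."""
    shortfall = max(demand - purchase, 0)
    return ADVANCE_PRICE * purchase + SPOT_PRICE * shortfall


def saa_best_decision(candidates, scenarios):
    """Choose the candidate with the lowest cost averaged over all scenarios."""
    def average_cost(purchase):
        return sum(scenario_cost(purchase, d) for d in scenarios) / len(scenarios)
    best = min(candidates, key=average_cost)
    return best, average_cost(best)


best_purchase, best_avg_cost = saa_best_decision(range(0, 21), SCENARIOS)
print(best_purchase, best_avg_cost)
```

The chosen decision is not optimal for any single scenario; it hedges across all of them, which is the point of optimizing over the scenarios jointly rather than over one point forecast.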


Does unsupervised grammar induction need pixels?

Boyi Li , Rodolfo Corona , Karttikeya Mangalam , Catherine Chen , Daniel Flaherty , Serge Belongie , Kilian Q. Weinberger , Jitendra Malik , Trevor Darrell , Dan Klein

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-20

Are extralinguistic signals such as image pixels crucial for inducing constituency grammars? While past work has shown substantial gains from multimodal cues, we investigate whether such gains persist in the presence of rich information from large language models (LLMs). We find that our approach, LLM-based C-PCFG (LC-PCFG), outperforms previous multi-modal methods on the task of unsupervised constituency parsing, achieving state-of-the-art performance on a variety of datasets. Moreover, LC-PCFG reduces the parameter count by over 50% and speeds up training by 1.7x relative to image-aided models and by more than 5x relative to video-aided models. These results challenge the notion that extralinguistic signals such as image pixels are needed for unsupervised grammar induction, and point to the need for better text-only baselines when evaluating whether multi-modality is needed for the task.


DSI++: Updating Transformer Memory with New Documents

Sanket Vaibhav Mehta , Jai Gupta , Yi Tay , Mostafa Dehghani , Vinh Q. Tran , Jinfeng Rao , Marc Najork , Emma Strubell , Donald Metzler

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-19

Differentiable Search Indices (DSIs) encode a corpus of documents in the parameters of a model and use the same model to map queries directly to relevant document identifiers. Despite the strong performance of DSI models, deploying them in situations where the corpus changes over time is computationally expensive because reindexing the corpus requires re-training the model. In this work, we introduce DSI++, a continual learning challenge for DSI to incrementally index new documents while being able to answer queries related to both previously and newly indexed documents. Across different model scales and document identifier representations, we show that continual indexing of new documents leads to considerable forgetting of previously indexed documents. We also hypothesize and verify that the model experiences forgetting events during training, leading to unstable learning. To mitigate these issues, we investigate two approaches. The first focuses on modifying the training dynamics. Flatter minima implicitly alleviate forgetting, so we optimize for flatter loss basins and show that the model stably memorizes more documents (+12%). Next, we introduce a generative memory to sample pseudo-queries for documents and supplement them during continual indexing to prevent forgetting for the retrieval task. Extensive experiments on novel continual indexing benchmarks based on Natural Questions (NQ) and MS MARCO demonstrate that our proposed solution mitigates forgetting by a significant margin. Concretely, it improves the average Hits@10 by +21.1% over competitive baselines for NQ and requires 6 times fewer model updates compared to re-training the DSI model for incrementally indexing five corpora in a sequence.
